Results 1 - 20 of 41
1.
Front Microbiol ; 15: 1373344, 2024.
Article in English | MEDLINE | ID: mdl-38596376

ABSTRACT

The DNA damage-inducible SOS response in bacteria serves to increase survival of the species at the cost of mutagenesis. The SOS response first initiates error-free repair followed by error-prone repair. Here, we have employed a multi-omics approach to elucidate the temporal coordination of the SOS response. Escherichia coli was grown in batch cultivation in bioreactors to ensure highly controlled conditions, and a low dose of the antibiotic ciprofloxacin was used to activate the SOS response while avoiding extensive cell death. Our results show that the expression of genes involved in both error-free and error-prone repair was induced shortly after DNA damage, thus challenging the established perception that the expression of error-prone repair genes is delayed. By combining transcriptomics and a sub-proteomics approach termed signalomics, we found that the temporal segregation of error-free and error-prone repair is primarily regulated after transcription, supporting the current literature. Furthermore, the heterology index (i.e., the binding affinity of LexA to the SOS box) was correlated to the maximum increase in gene expression and not to the time of induction of SOS genes. Finally, quantification of metabolites revealed increasing pyrimidine pools as a late feature of the SOS response. Our results elucidate how the SOS response is coordinated, showing a rapid transcriptional response and temporal regulation of mutagenesis on the protein and metabolite levels.

2.
medRxiv ; 2024 Jan 26.
Article in English | MEDLINE | ID: mdl-38293138

ABSTRACT

Neurodevelopmental proteasomopathies represent a distinctive category of neurodevelopmental disorders (NDD) characterized by genetic variations within the 26S proteasome, a protein complex governing eukaryotic cellular protein homeostasis. In our comprehensive study, we identified 23 unique variants in PSMC5, which encodes the AAA-ATPase proteasome subunit PSMC5/Rpt6, causing syndromic NDD in 38 unrelated individuals. Overexpression of PSMC5 variants altered human hippocampal neuron morphology, while PSMC5 knockdown led to impaired reversal learning in flies and loss of excitatory synapses in rat hippocampal neurons. PSMC5 loss-of-function resulted in abnormal protein aggregation, profoundly impacting innate immune signaling, mitophagy rates, and lipid metabolism in affected individuals. Importantly, targeting key components of the integrated stress response, such as PKR and GCN2 kinases, ameliorated immune dysregulations in cells from affected individuals. These findings significantly advance our understanding of the molecular mechanisms underlying neurodevelopmental proteasomopathies, provide links to research in neurodegenerative diseases, and open up potential therapeutic avenues.

3.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37798252

ABSTRACT

The emergence of massive datasets exploring the multiple levels of molecular biology has made their analysis and knowledge transfer more complex. Flexible tools to manage big biological datasets could be of great help for standardizing the usage of developed data visualizations and integration methods. Business intelligence (BI) tools have been used in many fields as exploratory tools. They have numerous connectors to link numerous data repositories with a unified graphic interface, offering an overview of data and facilitating interpretation for decision makers. BI tools could be a flexible and user-friendly way of handling molecular biological data with interactive visualizations. However, it is rather uncommon to see such tools used for the exploration of massive and complex datasets in biological fields. We believe that two main obstacles could be the reason. Firstly, we posit that the ways of importing data into BI tools are not compatible with biological databases. Secondly, BI tools may not be adapted to certain particularities of complex biological data, namely, the size, the variability of datasets and the availability of specialized visualizations. This paper highlights the use of five BI tools (Elastic Kibana, Siren Investigate, Microsoft Power BI, Salesforce Tableau and Apache Superset) with which the massive data management repository engine Elasticsearch is compatible. Four case studies will be discussed in which these BI tools were applied to biological datasets with different characteristics. We conclude that the performance of the tools depends on the complexity of the biological questions and the size of the datasets.


Subjects
Datasets as Topic , Software , Data Visualization
4.
Sci Transl Med ; 15(698): eabo3189, 2023 05 31.
Article in English | MEDLINE | ID: mdl-37256937

ABSTRACT

A critical step in preserving protein homeostasis is the recognition, binding, unfolding, and translocation of protein substrates by six AAA-ATPase proteasome subunits (ATPase-associated with various cellular activities) termed PSMC1-6, which are required for degradation of proteins by 26S proteasomes. Here, we identified 15 de novo missense variants in the PSMC3 gene encoding the AAA-ATPase proteasome subunit PSMC3/Rpt5 in 23 unrelated heterozygous patients with an autosomal dominant form of neurodevelopmental delay and intellectual disability. Expression of PSMC3 variants in mouse neuronal cultures led to altered dendrite development, and deletion of the PSMC3 fly ortholog Rpt5 impaired reversal learning capabilities in fruit flies. Structural modeling as well as proteomic and transcriptomic analyses of T cells derived from patients with PSMC3 variants implicated the PSMC3 variants in proteasome dysfunction through disruption of substrate translocation, induction of proteotoxic stress, and alterations in proteins controlling developmental and innate immune programs. The proteostatic perturbations in T cells from patients with PSMC3 variants correlated with a dysregulation in type I interferon (IFN) signaling in these T cells, which could be blocked by inhibition of the intracellular stress sensor protein kinase R (PKR). These results suggest that proteotoxic stress activated PKR in patient-derived T cells, resulting in a type I IFN response. The potential relationship among proteasome dysfunction, type I IFN production, and neurodevelopment suggests new directions in our understanding of pathogenesis in some neurodevelopmental disorders.


Subjects
Interferon Type I , Proteasome Endopeptidase Complex , Animals , Humans , Mice , Adenosine Triphosphatases/genetics , Drosophila melanogaster , Gene Expression , Proteasome Endopeptidase Complex/metabolism , Proteomics
5.
Andrology ; 11(5): 927-942, 2023 07.
Article in English | MEDLINE | ID: mdl-36697378

ABSTRACT

BACKGROUND: DNA methylation (DNAme) erasure and reacquisition occur during prenatal male germ cell development; some further remodeling takes place after birth during spermatogenesis. Environmental insults during germline epigenetic reprogramming may affect DNAme, presenting a potential mechanism for transmission of environmental exposures across multiple generations. OBJECTIVES: We investigated how germ cell DNAme is impacted by lifetime exposures to diets containing either low or high, clinically relevant, levels of the methyl donor folic acid and whether resulting DNAme alterations were inherited in germ cells of male offspring of subsequent generations. MATERIALS AND METHODS: Female mice were placed on a control (FCD), 7-fold folic acid deficient (7FD) or 10- to 20-fold supplemented (10FS and 20FS) diet before and during pregnancy. Resulting F1 litters were weaned on the respective diets. F2 and F3 males received control diets. Genome-wide DNAme at cytosines (within CpG sites) was assessed in F1 spermatogonia, and in F1, F2 and F3 sperm. RESULTS: In F1 germ cells, a greater number of differentially methylated cytosines (DMCs) were observed in spermatogonia as compared with F1 sperm for all folic acid diets. DMCs were lower in number in F2 versus F1 sperm, while an unexpected increase was found in F3 sperm. DMCs were predominantly hypomethylated, with genes in neurodevelopmental pathways commonly affected in F1, F2 and F3 male germ cells. While no DMCs were found to be significantly inherited inter- or transgenerationally, we observed over-representation of repetitive elements, particularly young long interspersed nuclear elements (LINEs). DISCUSSION AND CONCLUSION: These results suggest that the prenatal window is the time most susceptible to folate-induced alterations in sperm DNAme in male germ cells. Altered methylation of specific sites in F1 germ cells was not present in later generations. However, the presence of DNAme perturbations in the sperm of males of the F2 and F3 generations suggests that epigenetic inheritance mechanisms other than DNAme may have been impacted by the folate diet exposure of F1 germ cells.


Subjects
DNA Methylation , Folic Acid Deficiency , Pregnancy , Male , Female , Mice , Animals , Folic Acid Deficiency/genetics , Folic Acid Deficiency/metabolism , Semen/metabolism , Epigenesis, Genetic , Spermatozoa/metabolism , Folic Acid/metabolism , Dietary Supplements , Spermatogonia/metabolism , DNA/metabolism
6.
Front Mol Biosci ; 9: 962799, 2022.
Article in English | MEDLINE | ID: mdl-36158572

ABSTRACT

Protein-protein interactions (PPIs) lie at the heart of the cellular machinery and play a significant role in the regulation of cellular functions. PPIs can be analyzed with network approaches. Construction of a PPI network requires prediction of the interactions. All PPIs form a network. Different biases such as lack of data, recurrence of information, and false interactions make the network unstable. Integrated strategies allow solving these different challenges. These approaches have shown encouraging results for the understanding of molecular mechanisms, drug action mechanisms, and identification of target genes. To give more weight to an interaction, it is evaluated with different confidence scores. These scores allow the filtration of the network and thus facilitate its representation, both essential steps toward the identification and understanding of molecular mechanisms. In this review, we will discuss the main computational methods for predicting PPIs, including ones confirming an interaction, as well as the integration of PPIs into a network, and we will discuss visualization of these complex data.
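
The confidence-score filtering mentioned in this abstract can be sketched as follows. This is a minimal illustration and not a method taken from the review: the function name `filter_ppi_network`, the `(protein_a, protein_b, score)` edge format, the score range and the cutoff are all assumptions made for the example.

```python
from collections import defaultdict


def filter_ppi_network(edges, min_confidence):
    """Keep only interactions whose confidence score reaches min_confidence.

    edges: iterable of (protein_a, protein_b, score) tuples (hypothetical format).
    Returns an undirected adjacency mapping: protein -> set of partners.
    """
    network = defaultdict(set)
    for protein_a, protein_b, score in edges:
        if score >= min_confidence:
            # Interactions are treated as undirected, so store both directions.
            network[protein_a].add(protein_b)
            network[protein_b].add(protein_a)
    return dict(network)


# Made-up interactions and scores, for illustration only.
example_edges = [
    ("P1", "P2", 0.95),
    ("P2", "P3", 0.40),
    ("P1", "P4", 0.72),
]
filtered = filter_ppi_network(example_edges, min_confidence=0.7)
```

Raising the cutoff trades coverage for reliability: fewer edges survive, but the remaining network is easier to draw and less affected by false interactions.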

7.
Int J Mol Sci ; 23(12)2022 Jun 18.
Article in English | MEDLINE | ID: mdl-35743235

ABSTRACT

Rare diseases impact the lives of 300 million people in the world. Rapid advances in bioinformatics and genomic technologies have enabled the discovery of causes of 20-30% of rare diseases. However, most rare diseases have remained as unsolved enigmas to date. Newer tools and availability of high throughput sequencing data have enabled the reanalysis of previously undiagnosed patients. In this review, we have systematically compiled the latest developments in the discovery of the genetic causes of rare diseases using machine learning methods. Importantly, we have detailed methods available to reanalyze existing whole exome sequencing data of unsolved rare diseases. We have identified different reanalysis methodologies to solve problems associated with sequence alterations/mutations, variation re-annotation, protein stability, splice isoform malfunctions and oligogenic analysis. In addition, we give an overview of new developments in the field of rare disease research using whole genome sequencing data and other omics.


Subjects
Exome , Rare Diseases , Exome/genetics , High-Throughput Nucleotide Sequencing , Humans , Machine Learning , Rare Diseases/diagnosis , Rare Diseases/genetics , Exome Sequencing/methods
8.
Front Cell Dev Biol ; 10: 834519, 2022.
Article in English | MEDLINE | ID: mdl-35392175

ABSTRACT

Following their production in the testis, spermatozoa enter the epididymis where they gain their motility and fertilizing abilities. This post-testicular maturation coincides with sperm epigenetic profile changes that influence progeny outcome. While recent studies highlighted the dynamics of small non-coding RNAs in maturing spermatozoa, little is known regarding sperm methylation changes and their impact at the post-fertilization level. Fluorescence-activated cell sorting (FACS) was used to purify spermatozoa from the testis and different epididymal segments (i.e., caput, corpus and cauda) of CAG/su9-DsRed2; Acr3-EGFP transgenic mice in order to map out sperm methylome dynamics. Reduced representation bisulfite sequencing (RRBS-Seq) performed on DNA from these respective sperm populations indicated that high methylation changes were observed between spermatozoa from the caput vs. testis with 5,546 entries meeting our threshold values (q value <0.01, methylation difference above 25%). Most of these changes were transitory during epididymal sperm maturation according to the low number of entries identified between spermatozoa from cauda vs. testis. According to enzymatic and sperm/epididymal fluid co-incubation assays, (de)methylases were not found responsible for these sperm methylation changes. Instead, we identified that a subpopulation of caput spermatozoa displayed distinct methylation marks that were susceptible to sperm DNAse treatment and accounted for the DNA methylation profile changes observed in the proximal epididymis. Our results support the paradigm that a fraction of caput spermatozoa has a higher propensity to bind extracellular DNA, a phenomenon responsible for the sperm methylome variations observed at the post-testicular level. Further investigating the degree of conservation of this sperm heterogeneity in human will eventually provide new considerations regarding sperm selection procedures used in fertility clinics.
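
The threshold values quoted in this abstract (q value < 0.01, methylation difference above 25%) amount to a two-condition filter. The sketch below is only an illustration: the record layout and the site names are made up, and applying the 25% cutoff to the absolute difference (so that both gains and losses of methylation count) is an assumption the abstract does not spell out.

```python
def passes_thresholds(q_value, methylation_diff_percent,
                      q_cutoff=0.01, diff_cutoff=25.0):
    """Return True when an entry meets both thresholds.

    The absolute value is an assumption: it keeps hyper- and
    hypomethylated entries alike.
    """
    return q_value < q_cutoff and abs(methylation_diff_percent) > diff_cutoff


# Hypothetical entries: (identifier, q value, methylation difference in %).
entries = [
    ("site_1", 0.001, 40.0),   # significant, large gain
    ("site_2", 0.001, -30.0),  # significant, large loss
    ("site_3", 0.200, 50.0),   # large difference but not significant
    ("site_4", 0.005, 10.0),   # significant but small difference
]
selected = [name for name, q, diff in entries if passes_thresholds(q, diff)]
```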

9.
Nucleic Acids Res ; 50(5): e27, 2022 03 21.
Article in English | MEDLINE | ID: mdl-34883510

ABSTRACT

Multi-omics integration is key to fully understand complex biological processes in a holistic manner. Furthermore, multi-omics combined with new longitudinal experimental design can reveal dynamic relationships between omics layers and identify key players or interactions in system development or complex phenotypes. However, integration methods have to address various experimental designs and do not guarantee interpretable biological results. The new challenge of multi-omics integration is to solve interpretation and unlock the hidden knowledge within the multi-omics data. In this paper, we go beyond integration and propose a generic approach to face the interpretation problem. From multi-omics longitudinal data, this approach builds and explores hybrid multi-omics networks composed of both inferred and known relationships within and between omics layers. With smart node labelling and propagation analysis, this approach predicts regulation mechanisms and multi-omics functional modules. We applied the method on 3 case studies with various multi-omics designs and identified new multi-layer interactions involved in key biological functions that could not be revealed with single omics analysis. Moreover, we highlighted interplay in the kinetics that could help identify novel biological mechanisms. This method is available as an R package netOmics to readily suit any application.


Subjects
Genomics , Systems Biology/methods , Genomics/methods , Phenotype
10.
Bioinformatics ; 38(2): 577-579, 2022 01 03.
Article in English | MEDLINE | ID: mdl-34554215

ABSTRACT

MOTIVATION: Multi-omics data integration enables the global analysis of biological systems and discovery of new biological insights. Multi-omics experimental designs have been further extended with a longitudinal dimension to study dynamic relationships between molecules. However, methods that integrate longitudinal multi-omics data are still in their infancy. RESULTS: We introduce the R package timeOmics, a generic analytical framework for the integration of longitudinal multi-omics data. The framework includes pre-processing, modeling and clustering to identify molecular features strongly associated with time. We illustrate this framework in a case study to detect seasonal patterns of mRNA, metabolites, gut taxa and clinical variables in patients with diabetes mellitus from the integrative Human Microbiome Project. AVAILABILITY AND IMPLEMENTATION: timeOmics is available on Bioconductor and github.com/abodein/timeOmics. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genomics , Multiomics , Humans , Genomics/methods , Cluster Analysis
11.
Epigenomes ; 5(2)2021 May 01.
Article in English | MEDLINE | ID: mdl-34968297

ABSTRACT

Due to the grasshopper effect, the Arctic food chain in Canada is contaminated with persistent organic pollutants (POPs) of industrial origin, including polychlorinated biphenyls and organochlorine pesticides. Exposure to POPs may be a contributor to the greater incidence of poor fetal growth, placental abnormalities, stillbirths, congenital defects and shortened lifespan in the Inuit population compared to non-Aboriginal Canadians. Although maternal exposure to POPs is well established to harm pregnancy outcomes, paternal transmission of the effects of POPs is a possibility that has not been well investigated. We used a rat model to test the hypothesis that exposure to POPs during gestation and suckling leads to developmental defects that are transmitted to subsequent generations via the male lineage. Indeed, developmental exposure to an environmentally relevant Arctic POPs mixture impaired sperm quality and pregnancy outcomes across two subsequent, unexposed generations and altered sperm DNA methylation, some of which are also observed for two additional generations. Genes corresponding to the altered sperm methylome correspond to health problems encountered in the Inuit population. These findings demonstrate that the paternal methylome is sensitive to the environment and that some perturbations persist for at least two subsequent generations. In conclusion, although many factors influence health, paternal exposure to contaminants plays a heretofore-underappreciated role with sperm DNA methylation contributing to the molecular underpinnings involved.

12.
J Clin Med ; 10(20)2021 Oct 13.
Article in English | MEDLINE | ID: mdl-34682802

ABSTRACT

BACKGROUND: To explore the use of maternal urine proteome for the identification of preeclampsia biomarkers. METHODS: Maternal urine samples from women with and without preeclampsia were used for protein discovery followed by a validation study. The targeted proteins of interest were then measured in urine samples collected at 20-24 and 30-34 weeks among nine women who developed preeclampsia, one woman with fetal growth restriction, and 20 women with uncomplicated pregnancies from a longitudinal study. Protein identification and quantification was obtained using liquid chromatography-tandem mass spectrometry (LC-MS/MS). RESULTS: Among the 1108 urine proteins quantified in the discovery study, 21 were upregulated in preeclampsia and selected for validation. Nineteen (90%) proteins were confirmed as upregulated in preeclampsia cases. Among them, two proteins, ceruloplasmin and serpin A7, were upregulated at 20-24 weeks and 30-34 weeks of gestation (p < 0.05) in cases of preeclampsia, and could have served to identify 60% of women who subsequently developed preeclampsia and/or fetal growth restriction at 20-24 weeks of gestation, and 78% at 30-34 weeks, for a false-positive rate of 10%. CONCLUSIONS: Proteomic profiling of maternal urine can differentiate women with and without preeclampsia. Several proteins including ceruloplasmin and serpin A7 are upregulated in maternal urine before the diagnosis of preeclampsia and potentially fetal growth restriction.
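
The percentages in this abstract follow from simple ratios, sketched below. Only the 19-of-21 validation figure comes from the abstract; the counts passed to `detection_rate` and `false_positive_rate` are hypothetical, chosen solely so that the results equal the 60% and 10% quoted, since the abstract does not report the underlying counts. The function names are also illustrative.

```python
def detection_rate(true_positives, false_negatives):
    """Share of affected women flagged by the test (sensitivity)."""
    return true_positives / (true_positives + false_negatives)


def false_positive_rate(false_positives, true_negatives):
    """Share of unaffected women incorrectly flagged by the test."""
    return false_positives / (false_positives + true_negatives)


# From the abstract: 19 of the 21 selected proteins were confirmed.
validation_rate = 19 / 21
validation_label = f"{validation_rate:.0%}"

# Hypothetical counts, NOT reported in the abstract.
example_detection = detection_rate(true_positives=6, false_negatives=4)
example_fpr = false_positive_rate(false_positives=2, true_negatives=18)
```

Here `validation_label` is the rounded form of 19/21 (about 0.905), which matches the 90% stated in the abstract.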

13.
Comput Struct Biotechnol J ; 19: 3735-3746, 2021.
Article in English | MEDLINE | ID: mdl-34285775

ABSTRACT

Increased availability of high-throughput technologies has generated an ever-growing number of omics data that seek to portray many different but complementary biological layers including genomics, epigenomics, transcriptomics, proteomics, and metabolomics. New insights from these data have been obtained by machine learning algorithms that have produced diagnostic and classification biomarkers. Most biomarkers obtained to date, however, only include one omic measurement at a time and thus do not take full advantage of recent multi-omics experiments that now capture the entire complexity of biological systems. Multi-omics data integration strategies are needed to combine the complementary knowledge brought by each omics layer. We have summarized the most recent data integration methods/frameworks into five different integration strategies: early, mixed, intermediate, late and hierarchical. In this mini-review, we focus on challenges and existing multi-omics integration strategies by paying special attention to machine learning applications.
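
Two of the five strategies named in this abstract, early and late integration, can be contrasted in a few lines. The sketch assumes the common reading of these terms (early: concatenate the features of all layers before modelling; late: model each layer separately and combine the predictions). The function names, the toy values and the use of a plain average to combine scores are illustrative choices, not taken from the review.

```python
def early_integration(omics_blocks):
    """Concatenate the feature vectors of every omics layer, sample by sample.

    omics_blocks: list of layers; each layer is a list of per-sample
    feature lists, with samples in the same order in every layer.
    """
    return [
        [value for layer_row in sample_rows for value in layer_row]
        for sample_rows in zip(*omics_blocks)
    ]


def late_integration(per_layer_scores):
    """Average the prediction scores produced separately for each layer."""
    return [sum(scores) / len(scores) for scores in zip(*per_layer_scores)]


# Toy data for two samples (made-up values).
transcriptomics = [[1.0, 2.0], [3.0, 4.0]]
proteomics = [[0.5], [0.75]]
combined_features = early_integration([transcriptomics, proteomics])

# Hypothetical per-layer model outputs for the same two samples.
layer_scores = [[0.25, 0.5], [0.75, 1.0]]
combined_scores = late_integration(layer_scores)
```

Early integration yields one wide matrix for a single model, whereas late integration keeps one model per layer and only merges their outputs.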

14.
BMC Bioinformatics ; 22(1): 267, 2021 May 25.
Article in English | MEDLINE | ID: mdl-34034647

ABSTRACT

BACKGROUND: Network-based analysis of gene expression through co-expression networks can be used to investigate modular relationships occurring between genes performing different biological functions. An extended description of each of the network modules is therefore a critical step to understand the underlying processes contributing to a disease or a phenotype. Biological integration, topology study and condition comparison (e.g. wild vs mutant) are the main methods to do so, but to date no tool combines them all into a single pipeline. RESULTS: Here we present GWENA, a new R package that integrates gene co-expression network construction and whole characterization of the detected modules through gene set enrichment, phenotypic association, hub gene detection, topological metric computation, and differential co-expression. To demonstrate its performance, we applied GWENA on two skeletal muscle datasets from young and old patients of the GTEx study. Remarkably, we prioritized a gene whose involvement was unknown in the muscle development and growth. Moreover, new insights on the variations in patterns of co-expression were identified. The known phenomenon of connectivity loss associated with aging was found coupled to a global reorganization of the relationships leading to expression of known aging related functions. CONCLUSION: GWENA is an R package available through Bioconductor ( https://bioconductor.org/packages/release/bioc/html/GWENA.html ) that has been developed to perform extended analysis of gene co-expression networks. Thanks to biological and topological information as well as differential co-expression, the package helps to dissect the role of gene relationships in disease conditions or targeted phenotypes. GWENA goes beyond existing packages that perform co-expression analysis by including new tools to fully characterize modules, such as differential co-expression, additional enrichment databases, and network visualization.
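
The general idea of a gene co-expression network can be sketched as follows: genes are linked when their expression profiles are strongly correlated across samples. This is a generic illustration and does not reproduce GWENA's algorithm or API; the hard threshold on the absolute Pearson correlation, the function names and the toy expression values are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def coexpression_edges(expression, threshold):
    """Link gene pairs whose absolute correlation reaches the threshold.

    expression: mapping gene -> list of expression values across samples.
    Returns a list of (gene_1, gene_2, correlation) tuples.
    """
    edges = []
    for gene_1, gene_2 in combinations(sorted(expression), 2):
        r = pearson(expression[gene_1], expression[gene_2])
        if abs(r) >= threshold:
            edges.append((gene_1, gene_2, r))
    return edges


# Made-up expression values for three genes across four samples.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(toy_expression, threshold=0.8)
```

With this toy data only the geneA/geneB pair is kept, because geneC correlates weakly with both. Modules would then be sought as densely connected groups in the resulting graph.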


Subjects
Gene Regulatory Networks , Software , Gene Expression , Gene Expression Profiling , Humans
15.
Bioinformatics ; 37(17): 2706-2713, 2021 Sep 09.
Article in English | MEDLINE | ID: mdl-33751043

ABSTRACT

MOTIVATION: The growing production of massive heterogeneous biological data offers opportunities for new discoveries. However, performing multi-omics data analysis is challenging, and researchers are forced to handle the ever-increasing complexity of both data management and evolution of our biological understanding. Substantial efforts have been made to unify biological datasets into integrated systems. Unfortunately, they are not easily scalable, deployable and searchable, locally or globally. RESULTS: This publication presents two tools with a simple structure that can help any data provider, organization or researcher, requiring a reliable data search and analysis base. The first tool is Kibio, a scalable and adaptable data storage based on the Elasticsearch search engine. The second tool is KibioR, an R package to pull, push and search Kibio datasets or any accessible Elasticsearch-based databases. These tools apply a uniform data exchange model and minimize the burden of data management by organizing data into a decentralized, versatile, searchable and shareable structure. Several case studies are presented using multiple databases, from drug characterization to miRNAs and pathways identification, emphasizing the ease of use and versatility of the Kibio/KibioR framework. AVAILABILITY AND IMPLEMENTATION: Both KibioR and Elasticsearch are open source. KibioR package source is available at https://github.com/regisoc/kibior and the library on CRAN at https://cran.r-project.org/package=kibior. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

16.
Genes Nutr ; 15(1): 21, 2020 Nov 26.
Article in English | MEDLINE | ID: mdl-33243154

ABSTRACT

BACKGROUND: Increased adipogenesis and altered adipocyte function contribute to the development of obesity and associated comorbidities. Fructose modified adipocyte metabolism compared to glucose, but the regulatory mechanisms and consequences for obesity are unknown. Genome-wide methylation and global transcriptomics in SGBS pre-adipocytes exposed to 0, 2.5, 5, and 10 mM fructose, added to a 5-mM glucose-containing medium, were analyzed at 0, 24, 48, 96, 192, and 384 h following the induction of adipogenesis. RESULTS: Time-dependent changes in DNA methylation compared to baseline (0 h) occurred during the final maturation of adipocytes, between 192 and 384 h. Larger percentages (0.1% at 192 h, 3.2% at 384 h) of differentially methylated regions (DMRs) were found in adipocytes differentiated in the glucose-containing control media compared to adipocytes differentiated in fructose-supplemented media (0.0006% for 10 mM, 0.001% for 5 mM, and 0.005% for 2.5 mM at 384 h). A total of 1437 DMRs were identified in 5237 differentially expressed genes at 384 h post-induction in glucose-containing (5 mM) control media. The majority of them inversely correlated with the gene expression, but 666 regions were positively correlated to the gene expression. CONCLUSIONS: Our studies demonstrate that DNA methylation regulates or marks the transformation of morphologically differentiating adipocytes (seen at 192 h), to the more mature and metabolically robust adipocytes (as seen at 384 h) in a genome-wide manner. Lower (2.5 mM) concentrations of fructose have the most robust effects on methylation compared to higher concentrations (5 and 10 mM), suggesting that fructose may be playing a signaling/regulatory role at lower concentrations of fructose and as a substrate at higher concentrations.

17.
Int J Mol Sci ; 21(5)2020 Mar 07.
Article in English | MEDLINE | ID: mdl-32156068

ABSTRACT

Growing evidence demonstrates that epithelial-mesenchymal transition (EMT) plays an important role in epithelial ovarian cancer (EOC) progression and spreading; however, its molecular mechanisms remain poorly defined. We have previously shown that the antigen receptor LY75 can modulate EOC cell phenotype and metastatic potential, as LY75 depletion directed mesenchymal-epithelial transition (MET) in EOC cell lines with mesenchymal phenotype. We used the LY75-mediated modulation of EMT as a model to investigate DNA methylation changes during EMT in EOC cells, by applying the reduced representation bisulfite sequencing (RRBS) methodology. Numerous genes have displayed EMT-related DNA methylation pattern alterations in their promoter/exon regions. Ten selected genes, whose DNA methylation alterations were further confirmed by alternative methods, were further identified, some of which could represent new EOC biomarkers/therapeutic targets. Moreover, our methylation data were strongly indicative of the predominant implication of the Wnt/β-catenin pathway in the EMT-induced DNA methylation variations in EOC cells. Consecutive experiments, including alterations in the Wnt/β-catenin pathway activity in EOC cells with a specific inhibitor and the identification of LY75-interacting partners by a proteomic approach, were strongly indicative of the direct implication of the LY75 receptor in modulating the Wnt/β-catenin signaling in EOC cells.


Subjects
Antigens, CD/genetics , Carcinoma, Ovarian Epithelial/pathology , DNA Methylation/genetics , Epithelial-Mesenchymal Transition/genetics , Lectins, C-Type/genetics , Minor Histocompatibility Antigens/genetics , Ovarian Neoplasms/pathology , Receptors, Cell Surface/genetics , Wnt Signaling Pathway/genetics , beta Catenin/antagonists & inhibitors , Cell Line, Tumor , Female , Gene Expression Regulation, Neoplastic/genetics , Humans , RNA Interference , RNA, Small Interfering/genetics
18.
Front Genet ; 10: 452, 2019.
Article in English | MEDLINE | ID: mdl-31156708

ABSTRACT

The identification of biomarker signatures in omics molecular profiling is usually performed to predict outcomes in a precision medicine context, such as patient disease susceptibility, diagnosis, prognosis, and treatment response. To identify these signatures, we have developed a biomarker discovery tool, called BioDiscML. From a collection of samples and their associated characteristics, i.e., the biomarkers (e.g., gene expression, protein levels, clinico-pathological data), BioDiscML exploits various feature selection procedures to produce signatures associated to machine learning models that will predict efficiently a specified outcome. To this purpose, BioDiscML uses a large variety of machine learning algorithms to select the best combination of biomarkers for predicting categorical or continuous outcomes from highly unbalanced datasets. The software has been implemented to automate all machine learning steps, including data pre-processing, feature selection, model selection, and performance evaluation. BioDiscML is delivered as a stand-alone program and is available for download at https://github.com/mickaelleclercq/BioDiscML.

19.
Front Genet ; 10: 1349, 2019.
Article in English | MEDLINE | ID: mdl-32010198

ABSTRACT

One of the most challenging tasks of the post-genome-wide association studies (GWAS) research era is the identification of functional variants among those associated with a trait for an observed GWAS signal. Several methods have been developed to evaluate the potential functional implications of genetic variants. Each of these tools has its own scoring system, which forces users to become acquainted with each approach to interpret their results. From an awareness of the amount of work needed to analyze and integrate results for a single locus, we proposed a flexible and versatile approach designed to help the prioritization of variants by aggregating the predictions of their potential functional implications. This approach has been made available through a graphical user interface called DSNetwork, which acts as a single point of entry to almost 60 reference predictors for both coding and non-coding variants and displays predictions in an easy-to-interpret visualization. We confirmed the usefulness of our methodology by successfully identifying functional variants in four breast cancer and nine schizophrenia susceptibility loci.

20.
Brief Bioinform ; 20(4): 1269-1279, 2019 07 19.
Article in English | MEDLINE | ID: mdl-29272335

ABSTRACT

With the recent developments in the field of multi-omics integration, the interest in factors such as data preprocessing, choice of the integration method and the number of different omics considered has increased. In this work, the impact of these factors is explored when solving the problem of sample classification, by comparing the performances of five unsupervised algorithms: Multiple Canonical Correlation Analysis, Multiple Co-Inertia Analysis, Multiple Factor Analysis, Joint and Individual Variation Explained and Similarity Network Fusion. These methods were applied to three real data sets taken from literature and several ad hoc simulated scenarios to discuss classification performance in different conditions of noise and signal strength across the data types. The impact of experimental design, feature selection and parameter training has also been evaluated to unravel important conditions that can affect the accuracy of the result.


Subjects
Computational Biology/methods , Systems Integration , Unsupervised Machine Learning , Algorithms , Animals , Cluster Analysis , Computer Simulation , Databases, Factual , Factor Analysis, Statistical , Genomics/statistics & numerical data , Humans , Metabolomics/statistics & numerical data , Mice , Models, Biological , Multivariate Analysis , Proteomics/statistics & numerical data , Systems Biology , Unsupervised Machine Learning/statistics & numerical data